Multiplex networks describe a large variety of complex systems whose elements (nodes) can be
connected by different types of interactions forming different layers (networks) of the multiplex.
Multiplex networks include social networks, transportation networks, and biological networks in the
cell or in the brain. Extracting relevant information from these networks is of crucial importance
for solving challenging inference problems and for characterizing the microscopic and mesoscopic
structure of multiplex networks. Here we propose an information theory method to extract the network
between the layers of multiplex datasets, forming a “network of networks”. We build an indicator
function, based on the entropy of network ensembles, to characterize the mesoscopic similarities
between the layers of a multiplex network, and we use clustering techniques to characterize the
communities present in this network of networks. We apply the proposed method to study the Multiplex
Collaboration Network formed by scientists collaborating on different subjects and publishing in the
American Physical Society (APS) journals. The analysis of this dataset reveals the interplay between
the collaboration networks and the organization of knowledge in physics.

PACS numbers: 89.75.Fb, 89.75.Hc, 89.75.-k


I. INTRODUCTION 

Multiplex networks describe a large number of complex systems where the interactions are of
different nature. They are formed by a set of N nodes interacting through M different layers
(networks). Recently, multiplex networks have been used to characterize a large variety of
systems, including social networks [3], transportation networks, collaboration networks, and
brain networks [7]. Extracting relevant information from multiplex networks is central for
characterizing their microscopic and mesoscopic structure, for solving challenging inference
problems, and for devising good centrality measures.

Structural correlations are ubiquitous in multilayer networks and can be a powerful tool to
extract information from them. For example, the overlap of the links [14] in the different
layers of multiplex networks has been observed in systems as different as in-silico societies
[3], multilayer airport networks [1], and citation-collaboration networks [5]. Moreover, it was
recently shown that using the information on the link overlap it is possible to extract
information that cannot be extracted if the single layers are taken in isolation. Other examples
of correlations encoded in multiplex network structures include the correlation between the
degrees of the same node in different layers, and the activity distribution of the nodes.

All these structural correlations reflect local properties of multiplex networks. Nevertheless,
in complex networks, significant information is encoded in their mesoscale structure, i.e. their
organization into several clusters or communities.

Recently, new modularity measures for multilayer networks [8] have been proposed and new
multiplex community detection algorithms have been formulated [19] based on methods devised for
single networks [17]. Alternatively, inference methods have been proposed to decompose a single
network into different layers with distinct community structure, or to visualize multiplex
networks. Moreover, it has been recently observed that the communities on different layers of a
multiplex network typically overlap with each other, forming mesoscale structures that span
across different layers. This phenomenon is central for generalizing the concept of community to
multilayer networks and for modeling the emergence of communities [22].

In this paper our aim is to characterize the correlations of multiplex networks at the
mesoscopic scale, and to use this information in order to build a network between the layers of
multiplex datasets. In particular, we propose an information theory measure $\Theta^S$ able to
define similarities between the layers of a multiplex with respect to their mesoscopic
structures. This similarity is more significant when groups of nodes densely connected with each
other are simultaneously present on different layers, forming overlapping communities. This
measure is based on the concept of network entropy [23-25] and extends the $\Theta$ measure
presented in [26]. Using the similarity $\Theta^S$, we propose a method for extracting the
network between the layers of multiplex networks. We apply the proposed method to the
characterization of the American Physical Society (APS) Collaboration Multiplex Networks
extracted from the APS dataset [27]. Scientific collaboration networks have been studied
extensively in the context of single networks [28-32]. Nevertheless, additional relevant
information can be extracted if they are analyzed as a multilayer structure. The Collaboration
Multiplex Networks are formed by the authors of the APS papers, and by layers corresponding to
the Physics and Astronomy Classification Scheme (PACS) codes [33]. In particular, two authors
are linked on layer $\alpha$ if they have co-authored a paper with a PACS code corresponding to
layer $\alpha$. Since the PACS codes are organized in hierarchical levels, we constructed two APS
Collaboration Multiplex Networks corresponding to layers describing either the first or the
second level of the PACS hierarchy. The analysis performed on the APS Collaboration Multiplex
Networks has allowed us to characterize the network between the layers of these multiplex
networks, and to investigate the same dataset at different levels of resolution with respect to
the number of layers.

The paper is structured as follows: in Section II we define the indicator measure $\Theta^S$;
in Section III we test the measure on two different multiplex benchmark models of two-layer
networks with communities; in Section IV we use our measure to analyze the community structure
of the APS Collaboration Multiplex Network at two hierarchical levels of the PACS code; in
Section V we compare the results obtained with $\Theta^S$ with results obtained using other
similarity measures on the same dataset; finally, in Section VI we give the conclusions.


II. DEFINITION OF $\Theta^S$


Our goal here is to construct an information theory indicator function $\Theta^S$ to
characterize the similarity in the mesoscopic structure of the layers of a multiplex network.
This indicator function is based on the entropy of network ensembles, a quantity which plays a
key role when inference problems are addressed using an unbiased information theory approach
[25, 26]. In this section we define the indicator function $\Theta^S$. We consider a multiplex
network formed by N nodes $i = 1, 2, \ldots, N$ and M layers $\alpha = 1, 2, \ldots, M$. The
structure of the multiplex network is characterized by M adjacency matrices $a^\alpha$ of
elements $a^\alpha_{ij} = 1$ if node i is connected to node j in layer $\alpha$, and
$a^\alpha_{ij} = 0$ otherwise. We indicate with $k_i^\alpha$ the degree of node i on layer
$\alpha$, i.e. the number of neighbors that node i has on layer $\alpha$. The nodes having degree
$k_i^\alpha = 0$ in layer $\alpha$ are the isolated nodes, i.e. nodes that are not connected to
any other node in layer $\alpha$; in the context of multilayer networks they are also called
“inactive” nodes in layer $\alpha$ [6]. Conversely, all the nodes with $k_i^\alpha > 0$ are
called “active” nodes in layer $\alpha$.

We assume that each node i of layer $\alpha$ has a characteristic
$q_i^\alpha \in \{1, \ldots, Q^\alpha\}$. The quantity $q_i^\alpha$ can, for example, indicate
the community to which node i belongs. More generally, $q_i^\alpha$ can represent any feature of
the nodes in layer $\alpha$. Starting from this information we can classify the nodes into
$P^\alpha$ classes $p_i^\alpha \in \{1, \ldots, P^\alpha\}$ which take into account at the same
time the degree of the nodes and their characteristic $q_i^\alpha$. This is the minimal
assumption to capture the structure of networks with communities induced by the characteristics
$q^\alpha = \{q_i^\alpha\}_{i=1,2,\ldots,N}$ and strong heterogeneities in the degree.
Considering only the partition induced by the characteristics would imply neglecting the
structure induced by the degrees, which is clearly not a viable option for networks with broad
degree distributions.

Including other features of the nodes to define node classes could be a viable option. In this
case the characteristics $q^\alpha$ will take into account different features which might depend
on the specific network under consideration. Therefore, here we take the class $p_i^\alpha$ to be
a function of the degree $k_i^\alpha$ and of the characteristic $q_i^\alpha$, i.e.
$p_i^\alpha = f(k_i^\alpha, q_i^\alpha)$. The block structure of the network induced by the
classes $p_i^\alpha = f(k_i^\alpha, q_i^\alpha)$ is described by the matrices $e^\alpha$ of
elements $e^\alpha(p, p')$ indicating the total number of links on layer $\alpha$ between nodes
of class p and nodes of class p'. We define the entropy $\Sigma_{k^\alpha, q^\alpha}$ of a layer
$\alpha$ as the logarithm of the number of graphs preserving the block structure $e^\alpha$ in
the given layer. Counting the graphs that preserve a given block structure, this entropy takes
the simple expression

$$\Sigma_{k^\alpha, q^\alpha} = \log\left[\prod_{p<p'} \binom{n_p^\alpha n_{p'}^\alpha}{e^\alpha(p,p')} \prod_{p} \binom{n_p^\alpha (n_p^\alpha - 1)/2}{e^\alpha(p,p)}\right], \qquad (1)$$


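where

$$e^\alpha(p,p') = \sum_{i,j} a^\alpha_{ij}\, \delta[p_i^\alpha(k_i^\alpha, q_i^\alpha), p]\, \delta[p_j^\alpha(k_j^\alpha, q_j^\alpha), p'], \qquad (2)$$

for $p \neq p'$, and $e^\alpha(p,p)$, $n_p^\alpha$ given respectively by

$$e^\alpha(p,p) = \sum_{i<j} a^\alpha_{ij}\, \delta[p_i^\alpha(k_i^\alpha, q_i^\alpha), p]\, \delta[p_j^\alpha(k_j^\alpha, q_j^\alpha), p], \qquad (3)$$

and

$$n_p^\alpha = \sum_{i} \delta[p_i^\alpha(k_i^\alpha, q_i^\alpha), p], \qquad (4)$$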

with $\delta[x,y]$ indicating the Kronecker delta. The entropy $\Sigma_{k^\alpha, q^\alpha}$
measures how much information is encoded in the constraint imposed on the network, i.e. the
block structure $e^\alpha$. The smaller the entropy, the smaller the number of networks that
share the block structure $e^\alpha$. Therefore, the smaller the entropy of an ensemble, the
larger the level of information encoded by the constraint. If for a given assignment of the
characteristics $\{q_i^\alpha\}$ the entropy is much smaller than under a random hypothesis
(when the characteristics are reshuffled randomly between the nodes), then the network structure
reflects the characteristic assignment $\{q_i^\alpha\}$, and thus the characteristics
$\{q_i^\alpha\}$ capture relevant information with respect to the network structure. Following
this argument, the quantity $\Theta$ proposed in [25], which is based on the entropy of network
ensembles, has been shown to be an unbiased indicator able to quantify the specificity of a
generic layer $\alpha$ to the assignment $q^\alpha$. This information theory quantity is defined
as:

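$$\Theta_{k^\alpha, q^\alpha} = \frac{E_\pi[\Sigma_{k^\alpha, \pi(q^\alpha)}] - \Sigma_{k^\alpha, q^\alpha}}{\sqrt{E_\pi\left[\left(\Sigma_{k^\alpha, \pi(q^\alpha)} - E_\pi[\Sigma_{k^\alpha, \pi(q^\alpha)}]\right)^2\right]}}, \qquad (5)$$

where $E_\pi[\ldots]$ is the expected value over random uniform permutations $\pi(q^\alpha)$ of
the node characteristics $q^\alpha$ in layer $\alpha$.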
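As an illustration of Eqs. (1)-(5), the following is a minimal Python sketch, not the authors' implementation. It assumes a layer is given as a list of undirected edges between nodes labelled 0, ..., N-1 and that the class of a node is the pair (degree, characteristic). The function names `block_entropy` and `theta`, the number of permutations and the random seed are illustrative choices.

```python
import math
import random
import statistics
from collections import Counter
from itertools import combinations


def block_entropy(edges, q):
    """Log of the number of simple graphs preserving the block structure.

    edges: list of undirected pairs (i, j), with i != j and no repeats.
    q:     list with the characteristic of each node.
    """
    degrees = [0] * len(q)
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    # Assumption: the class of a node is the pair (degree, characteristic).
    cls = [(degrees[i], q[i]) for i in range(len(q))]
    n = Counter(cls)          # n_p: number of nodes in each class
    e = Counter()             # e(p, p'): number of links between classes
    for i, j in edges:
        e[tuple(sorted((cls[i], cls[j])))] += 1
    classes = sorted(n)
    sigma = 0.0
    for p in classes:         # links inside a class
        sigma += math.log(math.comb(n[p] * (n[p] - 1) // 2, e[(p, p)]))
    for p, r in combinations(classes, 2):   # links between two classes
        sigma += math.log(math.comb(n[p] * n[r], e[(p, r)]))
    return sigma


def theta(edges, q, n_perm=200, seed=0):
    """Distance of the entropy from its reshuffled mean, in standard deviations."""
    rng = random.Random(seed)
    sigma = block_entropy(edges, q)
    shuffled = list(q)
    samples = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)             # random uniform permutation of q
        samples.append(block_entropy(edges, shuffled))
    return (statistics.mean(samples) - sigma) / statistics.pstdev(samples)


# Toy layer: two 4-node cliques joined by a single link.
toy_edges = (list(combinations(range(4), 2))
             + list(combinations(range(4, 8), 2)) + [(3, 4)])
toy_q = [0, 0, 0, 0, 1, 1, 1, 1]
print(block_entropy(toy_edges, toy_q), theta(toy_edges, toy_q))
```

Passing the characteristics of a different layer as `q` while keeping the edges of layer $\alpha$ gives the cross-layer indicator in the same way. For large networks the binomial coefficients become very large integers, so a log-gamma formulation may be preferable to `math.comb`.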

Here we propose to use this quantity to compare the similarity between the different layers in
a multiplex network. Indeed, we can consider the characteristics $q^\beta$ of the nodes in layer
$\beta$ as an induced feature of the nodes in layer $\alpha$, and measure by the corresponding
indicator $\Theta_{k^\alpha, q^\beta}$ how much information the characteristics $q^\beta$ contain
with respect to the structure of layer $\alpha$. In particular, the indicator
$\Theta_{k^\alpha, q^\beta}$ is given by

$$\Theta_{k^\alpha, q^\beta} = \frac{E_\pi[\Sigma_{k^\alpha, \pi(q^\beta)}] - \Sigma_{k^\alpha, q^\beta}}{\sqrt{E_\pi\left[\left(\Sigma_{k^\alpha, \pi(q^\beta)} - E_\pi[\Sigma_{k^\alpha, \pi(q^\beta)}]\right)^2\right]}}. \qquad (6)$$

Therefore $\Theta_{k^\alpha, q^\beta}$ measures the specificity of layer $\alpha$ with respect
to the particular set $q^\beta$, which is the assignment of the characteristics of the nodes on
layer $\beta$.

When one considers a single layer, the entropy is independent of the choice adopted for
classifying isolated (inactive) nodes in layers belonging to multiplex networks. In fact, we can
either group all the isolated nodes in a single class or put each isolated node in a different
class, and the entropy value given by Eq. (1) does not change because the isolated nodes have no
links attached to them. Instead, the indicator function $\Theta_{k^\alpha, q^\alpha}$ might
depend on this choice because its construction involves several reshufflings of the
characteristics of the nodes.

When comparing different layers of a multiplex network, the nodes that are active in one layer
might not be active in another layer. Nevertheless, the information carried by the activity of
the nodes might be significant. For example, if two layers have very different activity
patterns, it might occur that the nodes inactive in one layer form a well defined cluster in the
other layer, resulting in very significant information that is important to capture. Therefore,
to distinguish between nodes active and inactive in a layer, it is convenient to classify all
the inactive nodes in one layer under a common characteristic. A similar argument can be made
about connected clusters of small size, which are “quasi-isolated”, such as the nodes belonging
to connected clusters of size 2 or 3. Depending on the number of such clusters, it might be
convenient to also classify nodes in connected components of size 2 or 3 into common
characteristics, as we will show in the next sections using the concrete examples of the APS
Collaboration Multiplex Networks. Here, if not stated otherwise, we will consider the case in
which the features $q^\alpha$ indicate the community of the nodes in layer $\alpha$ and the class
$p_i^\alpha$ takes a different value for each distinct pair $(k_i^\alpha, q_i^\alpha)$ where
$k_i^\alpha \neq 0$, while all the nodes with $k_i^\alpha = 0$ form another class of nodes.

In order to compare the level of information carried in layer $\alpha$ by the community
structure of layer $\beta$, $q^\beta$, with the level of information carried by the proper
community structure, $q^\alpha$, we define the quantity

$$\Theta_{\alpha,\beta} = \frac{\Theta_{k^\alpha, q^\beta}}{\Theta_{k^\alpha, q^\alpha}}. \qquad (7)$$


The quantity $\Theta_{\alpha,\beta}$ is a measure of how similar layer $\beta$ is to layer
$\alpha$ with respect to the community assignment q. If $\Theta_{\alpha,\beta} = 1$, the
community structure $q^\beta$, proper of layer $\beta$, carries the same level of information
for the structure of layer $\alpha$ as the community structure $q^\alpha$, proper of layer
$\alpha$. It is important to notice that the matrix $\Theta$ is in principle not symmetric. We
can construct the symmetric measure $\Theta^S_{\alpha,\beta}$ by symmetrizing the quantity
$\Theta_{\alpha,\beta}$, i.e. by defining

$$\Theta^S_{\alpha,\beta} = \frac{\Theta_{\alpha,\beta} + \Theta_{\beta,\alpha}}{2}. \qquad (8)$$


This is a symmetric measure indicating how similar layer $\alpha$ and layer $\beta$ are with
respect to their community structure. In Figure 1 we give a schematic summary of the method used
to construct the similarity measure $\Theta^S_{\alpha,\beta}$.

In a given multiplex network, we can then analyze the entire symmetric matrix $\Theta^S$
measuring the similarity between the community structure of the layers. This matrix
characterizes the entire multiplex network at the layer level, reducing the information about
the network structures to one matrix of similarity between the layers.
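The normalization of Eq. (7) and the symmetrization of Eq. (8) can be sketched in a few lines of Python. The sketch assumes that the cross-layer indicators have already been computed and stored in a square list of lists `theta_cross`, with `theta_cross[a][b]` holding the indicator of layer a evaluated with the characteristics of layer b. Both the name and the storage layout are illustrative assumptions.

```python
def symmetric_similarity(theta_cross):
    """Return the symmetric similarity matrix built from cross-layer indicators.

    theta_cross[a][b] is the indicator of layer a evaluated with the
    characteristics of layer b; the diagonal holds the single-layer values.
    """
    m = len(theta_cross)
    # Eq. (7): normalize each row by the layer's own indicator.
    ratio = [[theta_cross[a][b] / theta_cross[a][a] for b in range(m)]
             for a in range(m)]
    # Eq. (8): average the two directions.
    return [[(ratio[a][b] + ratio[b][a]) / 2 for b in range(m)]
            for a in range(m)]


# Two layers with hypothetical indicator values.
print(symmetric_similarity([[10.0, 5.0], [2.0, 8.0]]))
```

The diagonal of the result is always 1, since each layer is compared with its own community structure.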

In the following Section we will first test this measure on multiplex network benchmark models
with non-trivial community structure; then in the subsequent Section we will focus on
characterizing the APS Collaboration Multiplex Networks, where the layers are the collaboration
networks of scientists using different PACS numbers.

In this paper we are mostly concerned with similarities in the community structure of the
layers of a multiplex network. Nevertheless, it has to be stressed that the proposed approach
and the similarity measure $\Theta^S_{\alpha,\beta}$ are general and can be used by considering
any available feature of the nodes related to the structure of the layers.


III. TESTING $\Theta^S$ ON BENCHMARK MODELS


In order to validate our similarity measure $\Theta^S$ on a well defined multiplex
architecture, with respect to the community structures of the different layers, we have
developed two benchmark models with communities. In particular, we want to construct benchmark
multiplex network models with a controlled level of overlap between the communities in different
layers. Given, in a generic multilayer network, the community assignment $q^\alpha$ of the nodes
on each layer $\alpha$, we define the community overlap as

$$O^c = \max_{\{\pi\}} \frac{2}{M(M-1)N} \sum_{\alpha<\beta}^{M} \sum_{i=1}^{N} \delta[q_i^\alpha, \pi(q_i^\beta)], \qquad (9)$$

where M indicates the total number of layers, N indicates the total number of nodes,
$\delta[x,y]$ indicates the Kronecker delta, and the maximum is taken over all the permutations
$\pi(q^\beta)$ of the labels of the communities in layer $\beta$.

FIG. 1: (Color online) Diagram showing the method. Panel a): We consider a layer $\alpha$ in a
multiplex network and we define the node classes $p^\alpha = (k^\alpha, q^\alpha)$, where
$k^\alpha$ indicates the node degrees and $q^\alpha$ the node characteristics on layer $\alpha$.
These classes induce a block structure in the network specified by the number of links between
the nodes of each class and the number of links connecting the nodes in different classes.
Panel b): The entropy $\Sigma_{k^\alpha, q^\alpha}$ given by Eq. (1) is calculated and compared
with the entropy distribution obtained under a random hypothesis, by performing random uniform
permutations $\pi(q^\alpha)$ of the characteristics $q^\alpha$ of the nodes and subsequently
measuring the $\Sigma_{k^\alpha, \pi(q^\alpha)}$ values. The mean
$E_\pi[\Sigma_{k^\alpha, \pi(q^\alpha)}]$ and standard deviation
$\sigma_\pi[\Sigma_{k^\alpha, \pi(q^\alpha)}]$ of the entropy distribution are thus calculated.
The indicator function $\Theta_{k^\alpha, q^\alpha}$ measures the difference between
$\Sigma_{k^\alpha, q^\alpha}$ and $E_\pi[\Sigma_{k^\alpha, \pi(q^\alpha)}]$ in units of
$\sigma_\pi[\Sigma_{k^\alpha, \pi(q^\alpha)}]$. Panel c): Given a second layer $\beta$,
$\Theta_{\alpha,\beta}$ characterizes the information about the structure in layer $\alpha$
carried by the characteristics of the nodes in layer $\beta$. In order to define a symmetric
indicator function of the similarity between layers $\alpha$ and $\beta$, we define the
indicator $\Theta^S_{\alpha,\beta}$ that symmetrizes the indicator function
$\Theta_{\alpha,\beta}$.
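To make the community overlap of Eq. (9) concrete, here is a small Python sketch. It is a brute-force illustration under two assumptions that are not taken from the text: the maximization over label permutations is carried out separately for every pair of layers, and the number of communities is small enough to enumerate all permutations. The function names are illustrative.

```python
from itertools import combinations, permutations


def pair_overlap(q_a, q_b):
    """Largest fraction of nodes with equal labels over relabelings of q_b."""
    labels = sorted(set(q_a) | set(q_b))
    best = 0
    for perm in permutations(labels):       # factorial cost: small Q only
        relabel = dict(zip(labels, perm))
        matches = sum(1 for x, y in zip(q_a, q_b) if x == relabel[y])
        best = max(best, matches)
    return best / len(q_a)


def community_overlap(assignments):
    """Average of the pairwise overlaps over all pairs of layers."""
    pairs = list(combinations(range(len(assignments)), 2))
    total = sum(pair_overlap(assignments[a], assignments[b]) for a, b in pairs)
    return total / len(pairs)


print(pair_overlap([1, 1, 2, 2], [2, 2, 1, 1]))   # same partition, swapped labels
print(pair_overlap([1, 1, 2, 2], [1, 2, 2, 2]))   # one node assigned differently
```

For larger numbers of communities the same maximum can be obtained by solving an assignment problem on the contingency table of the two labelings instead of enumerating permutations.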

We define two benchmark models (see Figure 2) based respectively on the Girvan and Newman (GN)
model [34] and on the Lancichinetti-Fortunato-Radicchi (LFR) model [35], which are very well
established benchmarks for single networks with communities. The proposed benchmarks are
designed to tune the overlap of communities between different layers of simple multiplex
networks having respectively homogeneous or heterogeneous degree distribution and community size
distribution.

For the first benchmark model, the Duplex Network GN model (DNGN), we construct a duplex
network (a multiplex network made of two layers) in which each layer is formed by a GN network
realization. Therefore each of the layers is formed by N nodes divided into 4 equal-size
clusters of size $N_c$.

The network in each layer is a random network in which each node has a probability $p_{in}$ to
link to nodes of its own community and a probability $p_{out}$ to link to nodes outside its
community. In particular, we have chosen $p_{in}$ and $p_{out}$ in order to have, for each node,
a mean degree $\langle k \rangle = 16$ and a mean number of links outside the community given by
$\langle k_{out} \rangle = 4$. The layers generated in this way have a well defined community
structure and they are essentially random with respect to other network characteristics. The
characteristic $q_i^\alpha$ indicates the community to which a node i belongs on layer
$\alpha = 1, 2$. Here we consider the possible correlations existing between the community
assignments $q_i^1$ and $q_i^2$ in the two layers. This community assignment allows us to tune
in a controlled way the level of overlap between the communities. In particular, we label the
nodes $i = 1, \ldots, N$ in layer 1 according to the following community assignment $q_i^1$,

$$q_i^1 = \left\lceil \frac{i}{N_c} \right\rceil, \qquad (10)$$

where the brackets $\lceil x \rceil$ on the right-hand side of this expression indicate the
ceiling function of x. Therefore we have, for N = 128 and $N_c = 32$,

$$q_i^1 = \begin{cases} 1 & \text{for } i \in [1, 32] \\ 2 & \text{for } i \in [33, 64] \\ 3 & \text{for } i \in [65, 96] \\ 4 & \text{for } i \in [97, 128] \end{cases}$$


The community assignment in layer 2 will not in general be the same as in layer 1. In order to
model the overlap of communities we perform a simple “shift” of the labels, parametrized by the
parameter $p \geq 0$. In particular we take

$$q_i^2 = \begin{cases} \left\lceil \dfrac{i - p N_c}{N_c} \right\rceil & \text{for } i > p N_c \\ N/N_c & \text{for } i \leq p N_c \end{cases}$$

In general the control parameter p takes values $0 \leq p \leq 0.5$. If p = 0 there is no
“shift” between the layer partitions (they perfectly match); if p > 0 each community in the
first layer overlaps with the corresponding one in the second layer for a number of nodes equal
to $(1 - p) \cdot N_c$; thus $p \cdot N_c$ is the number of “shifted” nodes per community. When
p = 0.5, N = 128 and $N_c = 32$, we have

$$q_i^2 = \begin{cases} 1 & \text{for } i \in [17, 48] \\ 2 & \text{for } i \in [49, 80] \\ 3 & \text{for } i \in [81, 112] \\ 4 & \text{for } i \in [1, 16] \cup [113, 128] \end{cases}$$

Therefore p = 0.5 describes the maximum “shift” between the communities of the two layers: each
community in the first layer shares 16 nodes with its corresponding community in the second
layer. Given a value of p, the overall community overlap in the network can be easily
calculated, being $O^c = (1 - p)$, and in the case of maximum “shift” we obtain $O^c = 0.5$.
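The DNGN label assignment can be reproduced with a short Python sketch. The layer 1 labels follow Eq. (10). For layer 2 the sketch uses a cyclic shift written with a modulo, which is an assumed formulation chosen only because it reproduces the worked example for p = 0.5. The function name and default arguments are illustrative.

```python
import math


def dngn_labels(n=128, n_c=32, p=0.0):
    """Community labels of the two layers for nodes i = 1, ..., n."""
    shift = round(p * n_c)                 # number of shifted nodes per community
    layer1 = [math.ceil(i / n_c) for i in range(1, n + 1)]
    # Assumed formulation: cyclic shift of the layer 1 partition by `shift` nodes.
    layer2 = [math.ceil(((i - 1 - shift) % n + 1) / n_c) for i in range(1, n + 1)]
    return layer1, layer2


layer1, layer2 = dngn_labels(p=0.5)
same_label = sum(a == b for a, b in zip(layer1, layer2)) / len(layer1)
print(same_label)   # fraction of nodes keeping their label, to compare with 1 - p
```

With p = 0.5 the nodes 17 to 48 receive label 1 and the nodes 1 to 16 together with 113 to 128 receive label 4, as in the example above.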

For the second benchmark model, the Duplex Network LFR model (DNLFR), we have taken a duplex
network in which the single layers are constructed according to the LFR model [35].

1. The network in the first layer is an LFR network formed by Q communities. The communities
are labelled according to their size in descending order.

2. The network in the second layer is an LFR network with Q communities generated using the
same parameters used for the network in the first layer. Additionally, we require that the
network in the second layer satisfies a further condition, which allows us to modulate the
overlap between the communities in the two layers. Specifically, for each second-layer
candidate, we first label the communities according to their size in descending order. Then we
compare each of them to the corresponding one in the first layer (panel B of Figure 2). We
calculate the number of “shifted” nodes $N_s$ given by the sum of the absolute values of the
differences between the corresponding community sizes, i.e.

$$N_s = \sum_{l=1}^{Q} \left| n_l^1 - n_l^2 \right|, \qquad (11)$$

where $n_l^\alpha$ is the size of community l in layer $\alpha$. Finally, we retain the
candidate network as the second layer of the duplex network only if

$$\lfloor (p - \Delta p) \cdot S_{min} \rfloor < N_s < \lfloor (p + \Delta p) \cdot S_{min} \rfloor, \qquad (12)$$

where $\lfloor \ldots \rfloor$ is the floor function. Here p and $\Delta p$ are control
parameters of the benchmark model that modulate the overlap of the communities, and $S_{min}$
in Eq. (12) is the parameter that in the LFR model fixes the lower bound of the community
sizes. In this way, if one considers a sufficient number of realizations of the multilayer
network and a sufficiently low value of $\Delta p$, one gets

$$\langle N_s \rangle \simeq \lfloor p \cdot S_{min} \rfloor. \qquad (13)$$

3. Finally, the nodes are relabelled in both layers in order to allow the maximum community
overlap. In particular, the labels are reassigned in such a way that the number of nodes in
common between the communities that have the same label in the two layers is equal to the
minimum of the two community sizes (see Figure 2).

Therefore the average community overlap of the benchmark network depends on p and, for a
significant number of realizations and low enough values of $\Delta p$, is given by

$$\langle O^c \rangle = 1 - \frac{\langle N_s \rangle}{2N} \simeq 1 - \frac{\lfloor p \cdot S_{min} \rfloor}{2N}. \qquad (14)$$

FIG. 2: (Color online) Schematic of the benchmark models DNGN and DNLFR. Panel (A): The DNGN
benchmark model: nodes on both layers (blue and red) are divided into four communities of equal
size $N_c$, labelled from 1 to 4. Each community of layer 1 overlaps for $(1 - p) \cdot N_c$
nodes with its corresponding community in layer 2. Panel (B): The DNLFR benchmark model: on each
layer Q = 5 non-homogeneous communities are generated and labelled from 1 to 5 according to
their size (left). For a given p the total number of nodes which do not overlap between
communities of the same label, $N_s$, has values
$\lfloor (p - \Delta p) \cdot S_{min} \rfloor < N_s < \lfloor (p + \Delta p) \cdot S_{min} \rfloor$,
where $\lfloor \ldots \rfloor$ is the floor function and $S_{min}$ is the lower bound of the
power-law distribution from which the community sizes in the two layers are extracted.
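The acceptance test applied to a second-layer candidate in the DNLFR model can be illustrated with a few lines of Python. The sketch works directly on lists of community sizes, which would in practice come from an LFR generator that is not shown here. The function names and the example sizes are hypothetical, and the final function assumes that both layers contain the same total number of nodes.

```python
import math


def shifted_nodes(sizes1, sizes2):
    """Sum of size differences after ranking communities by decreasing size."""
    s1 = sorted(sizes1, reverse=True)
    s2 = sorted(sizes2, reverse=True)
    return sum(abs(a - b) for a, b in zip(s1, s2))


def accept_candidate(sizes1, sizes2, p, delta_p, s_min):
    """Acceptance window on the number of shifted nodes (strict inequalities)."""
    n_s = shifted_nodes(sizes1, sizes2)
    low = math.floor((p - delta_p) * s_min)
    high = math.floor((p + delta_p) * s_min)
    return low < n_s < high


def max_overlap(sizes1, sizes2):
    """Overlap when same-rank communities share min(size1, size2) nodes."""
    s1 = sorted(sizes1, reverse=True)
    s2 = sorted(sizes2, reverse=True)
    return sum(min(a, b) for a, b in zip(s1, s2)) / sum(s1)


# Hypothetical community sizes for two layers with N = 600 nodes each.
first = [180, 150, 120, 90, 60]
second = [170, 160, 120, 90, 60]
n_s = shifted_nodes(first, second)
print(n_s, max_overlap(first, second), 1 - n_s / (2 * sum(first)))
```

The last two printed values agree because, when both layers have the same number of nodes, the sum of the minima equals N minus half the sum of the absolute differences.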
FIG. 3: (Color online) The similarity measure $\Theta^S$ between the two layers of the DNGN
(blue diamonds) and DNLFR (orange circles) benchmark models is measured as a function of the
control parameter p. When p increases, the total community overlap between the layers decreases
and $\Theta^S$ decreases monotonically both in the case of homogeneous-size communities (DNGN)
and in the case of heterogeneous-size communities (DNLFR). Each data point is averaged over 50
benchmark realizations. For the DNLFR model the parameter $\Delta p$ was set to 0.05.

In order to test the performance of the similarity measure $\Theta^S$, we apply this measure to
the two duplex network benchmarks for different values of p. Since p modulates the level of
community overlap between the layers, we expect the similarity measure $\Theta^S$ to be larger
for lower values of p (corresponding to larger community overlap $O^c$ between the layers) and
smaller for larger values of p (corresponding to smaller community overlap $O^c$ between the
layers). In Figure 3 we show the dependence of $\Theta^S$ on p for the two proposed benchmark
models. In both cases the displayed values of $\Theta^S$ are averaged over 50 benchmark
realizations.

For the DNGN benchmark, we considered N = 128, $N_c = 32$ and $p \leq 0.5$. The similarity
measure $\Theta^S$ is monotonically decreasing with p. For the DNLFR benchmark the two single
layers are generated according to the LFR algorithm with parameters N = 600 (number of nodes)
and Q = 5 (number of communities). The size of each community is taken from a power-law
distribution with lower bound $S_{min} = 60$, upper bound $S_{max} = 180$, and power-law
exponent $\tau_1 = 1.5$. Inside the communities the node degree distribution is also extracted
from a power-law distribution, with parameters $k_{max} = 50$ (maximum degree), $\tau_2 = 2.6$
(power-law exponent), and $\langle k \rangle = 16$ (average degree). For building the DNLFR
network we used $\Delta p = 0.05$ and p < 0.95. Also in the case of the DNLFR benchmark, where
the size of the communities is heterogeneous, $\Theta^S$ decreases monotonically with p.

This result shows that in benchmark models in which the community overlap is modulated by an
external control parameter, $\Theta^S$ decreases together with the community overlap. Since in
general measuring the community overlap involves an optimization over the permutations of the
community assignment, it can be numerically very costly. In this situation, calculating
$\Theta^S$ could instead provide an alternative way to assess the similarity between the layers
of a multiplex network.

In Section V, using the concrete examples of the APS Collaboration Multiplex Networks, we will
compare the similarity measure $\Theta^S$ to other existing measures introduced to compare
different community assignments in single layers.

IV. THE NETWORK BETWEEN THE LAYERS 
OF THE APS COLLABORATION MULTIPLEX 
NETWORKS 

In this Section, we use the similarity matrix $\Theta^S$ to analyze the APS Collaboration
Multiplex Networks. These multiplex networks are extracted from the APS collaboration dataset
[27], which records all the bibliometric information about the papers published in the APS
journals.

The network is formed by a set of N nodes representing the APS authors. Since there is no
agreement on disambiguation techniques for the author names, we have identified each author
with the initials of his/her first name and last name. The layers correspond to different
Physics and Astronomy Classification Scheme (PACS) codes [33] describing the subject of the
papers. Two authors are linked in a given layer $\alpha$ if they are co-authors of at least one
paper having the PACS number corresponding to layer $\alpha$. Since PACS numbers are organized
in a hierarchical way (the first digit of the number indicates the general field of physics
while the second digit specifies the ambit inside that field), we have constructed two
multiplex networks whose layers correspond respectively to the first and second hierarchical
level of the PACS codes. The APS Collaboration Multiplex Network related to the first level of
the hierarchy of PACS codes is made of $M_1 = 10$ layers, each one describing the collaboration
network in a general field of physics. The APS Collaboration Multiplex Network at the second
level of the hierarchy is made of $M_2 = 66$ layers, each one describing the collaboration
network in a specific ambit of physics (second level of the PACS code hierarchy).

In extracting the APS Collaboration Multiplex Networks we considered all the papers until 2014
with fewer than ten co-authors. This threshold was introduced to exclude papers coming from big
collaborations, which follow different statistical properties with respect to the rest of the
dataset. With this threshold, our dataset includes a large fraction of the whole dataset
(~ 97% of the total number of papers) and a number of authors N = 180,539.
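The construction of the layers from bibliographic records can be sketched as follows. This is an illustration under assumptions, not the procedure used to process the APS dataset: each paper is represented as a pair (list of author identifiers, list of PACS code strings), the threshold is read as at most nine authors per paper, and the layer key is taken from the leading digits of the PACS code. The records and author names are invented for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_layers(papers, level=1, max_authors=9):
    """Map each PACS prefix to the set of co-authorship links of that layer.

    level=1 keeps the first digit of the code, level=2 the first two digits.
    """
    layers = defaultdict(set)
    for authors, pacs_codes in papers:
        authors = sorted(set(authors))
        if len(authors) > max_authors:      # skip large collaborations
            continue
        for code in pacs_codes:
            layer = code[:level]
            for pair in combinations(authors, 2):
                layers[layer].add(pair)     # undirected link, stored once
    return layers


# Hypothetical records: (author identifiers, PACS codes of the paper).
papers = [
    (["A. Rossi", "B. Chen"], ["89.75.Fb", "05.10.-a"]),
    (["A. Rossi", "C. Diaz"], ["89.65.-s"]),
    (["X%d" % i for i in range(12)], ["89.75.Hc"]),   # dropped by the threshold
]
print(sorted(build_layers(papers, level=1)))
print(sorted(build_layers(papers, level=2)))
```

Authors who never appear in a given layer are simply absent from its link set, which corresponds to the inactive nodes of that layer.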

The layers of the APS Collaboration Multiplex Networks are characterized by significantly
different activity patterns of the nodes. Moreover, roughly 0.7% of the nodes belong to
connected components of size 2, while only about 0.006% of the nodes belong to connected
components of size 3. Therefore we consider here the case in which the characteristics
$\{q_i^\alpha\}$ indicate the community of the nodes in layer $\alpha$, and the class
$p_i^\alpha$ of node i in layer $\alpha$ takes a different value for each distinct pair
$(k_i^\alpha, q_i^\alpha)$ as long as node i is not isolated ($k_i^\alpha > 0$) and belongs to a
community of more than two nodes. All the isolated nodes belong to the same class p. All the
nodes belonging to a two-node community belong to another class p.

Let us first characterize the mesoscale similarities between the $M_1 = 10$ layers of the APS
Collaboration Multiplex Network in the main subjects of physics, described by the first level
of the PACS code hierarchy. The similarity matrix $\Theta^S$ is constructed in two different
ways, using either the Infomap community detection algorithm [36] or the Louvain algorithm, and
averaging in both cases over 350 random permutations of the community assignments. For
simplicity we will refer to these two matrices as Infomap-$\Theta^S$ and Louvain-$\Theta^S$.
The two matrices are reported in Figure 4 in the form of heat-maps. The patterns shown by the
two heat-maps are very similar, indicating that, qualitatively, the measure $\Theta^S$ is not
affected by the choice of the algorithm used to perform the community detection for the network
under study. We can observe that, in general, clusters in the APS Collaboration Multiplex
Network extend across multiple layers. As expected, layers describing collaborations in general
or interdisciplinary fields such as General Physics or Interdisciplinary Physics, which often
involve people from different specific ambits of physics, show high values of $\Theta^S$ with
respect to several other layers, while more specific fields, such as Gases&Plasma, show lower
values of $\Theta^S$ with respect to the other layers.

Given this similarity measure between the layers of the multiplex, one can build a network of networks whose nodes represent the $M_1 = 10$ networks of collaboration in general fields of physics and whose weighted edges are the values $\widetilde{\Theta}^{S}_{\alpha,\beta}$, representing the similarity between the $M_1$ networks with respect to their community structure. This network of layers is thus a weighted, fully connected network that itself shows a significant community structure, revealing how the pattern of collaboration between scientists is organized across different fields of physics. In order to characterize this community structure between the layers of the multiplex network, we perform a hierarchical clustering analysis starting from the dissimilarity matrix $d$ of elements $d_{\alpha,\beta}$ given by


$$d_{\alpha,\beta} = 1 - \cdots \qquad (15)$$


Specifically, we use the average linkage clustering method, which gave the best cophenetic correlation coefficient compared to other clustering methods. According to the average method, the distance $d_c(C_1, C_2)$ between two clusters $C_1$ and $C_2$ is defined as the average distance between all pairs of layers in the two clusters:

$$d_c(C_1, C_2) = \frac{1}{\mathcal{N}(C_1)\,\mathcal{N}(C_2)} \sum_{\alpha \in C_1} \sum_{\beta \in C_2} d_{\alpha,\beta}, \qquad (16)$$

where $\mathcal{N}(C_i)$ indicates the number of layers in cluster $C_i$.

FIG. 4: (Color online) The similarity matrices of elements $\widetilde{\Theta}^{S}_{\alpha,\beta}$, calculated respectively using the Louvain and the Infomap community detection algorithms, are plotted for the APS Collaboration Multiplex Network with the $M_1 = 10$ layers indicating the collaboration networks at the first level of the PACS hierarchy. Each layer refers to a general field of physics (see Table I for the legend of the layer acronyms). The dendrogram between the layers is shown on the left of each matrix. The dashed line on top of the dendrogram indicates the partition that corresponds to the optimal value of the weighted modularity given by Eq. (17).
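As a minimal sketch of the average linkage distance of Eq. (16), the following Python function averages the dissimilarity over all pairs of layers drawn from two clusters. The function name, the list-of-lists representation of the dissimilarity matrix and the toy values are illustrative assumptions and are not taken from the APS dataset.

```python
from itertools import product


def average_linkage_distance(d, cluster_1, cluster_2):
    """Average of d[a][b] over all pairs (a in cluster_1, b in cluster_2)."""
    if not cluster_1 or not cluster_2:
        raise ValueError("both clusters must contain at least one layer")
    total = sum(d[a][b] for a, b in product(cluster_1, cluster_2))
    # N(C_1) * N(C_2) is the number of layer pairs between the two clusters.
    return total / (len(cluster_1) * len(cluster_2))


# Toy symmetric dissimilarity matrix between four hypothetical layers.
toy_d = [
    [0.0, 0.2, 0.8, 0.6],
    [0.2, 0.0, 0.4, 1.0],
    [0.8, 0.4, 0.0, 0.1],
    [0.6, 1.0, 0.1, 0.0],
]

if __name__ == "__main__":
    print(average_linkage_distance(toy_d, [0, 1], [2, 3]))
```

In an agglomerative procedure, this distance would be evaluated for every pair of current clusters, and the pair with the smallest value merged at each stage.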


In Figure 4, together with the matrices Infomap-$\widetilde{\Theta}^{S}$ and Louvain-$\widetilde{\Theta}^{S}$, we show the dendrograms resulting from the hierarchical clustering analysis of the respective dissimilarity matrices Infomap-$d$ and Louvain-$d$. In order to define an optimal partition of the layers into communities, we looked for the agglomerative stage of the cluster hierarchy at which the weighted modularity $Q$ is

Acronym         PACS   Field
General-0       00     General
Particles-1     10     Physics of Elementary Particles and Fields
Nuclear-2       20     Nuclear Physics
Ato&Mol-3       30     Atomic and Molecular Physics
Classical-4     40     Electromagnetism, Optics, Acoustics, Heat Transfer, Classical Mechanics and Fluid Dynamics
Gas&Pla-5       50     Physics of Gases, Plasmas and Electric Discharges
Cond Mat I-6    60     Condensed Matter: Structural, Mechanical and Thermal Properties
Cond Mat II-7   70     Condensed Matter: Electronic Structure, Electrical, Magnetic and Optical Properties
Interd-8        80     Interdisciplinary Physics and Related Areas of Science and Technology
Geo&Astro-9     90     Geophysics, Astronomy and Astrophysics

TABLE I: The acronyms used in this study for the PACS numbers at the first level of the PACS hierarchy, the corresponding PACS numbers and the corresponding general fields of physics.


maximized, with $Q$ defined as:

(17)

where $\sigma_\alpha$ labels the community to which layer $\alpha$ belongs, $\delta[x, y]$ indicates the Kronecker delta, and $\eta_\alpha$ and $\langle \eta \rangle$ are given respectively by

$$\eta_\alpha = \sum_{\beta \neq \alpha} \widetilde{\Theta}^{S}_{\alpha,\beta}, \qquad \langle \eta \rangle = \sum_{\alpha} \eta_\alpha. \qquad (18)$$
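A minimal sketch of a weighted modularity computation is given below. It assumes the standard Newman form for weighted networks, in which each layer's strength is the sum of its similarities to the other layers and the normalization is the sum of all strengths; this form is an assumption made for illustration and may differ in detail from Eq. (17). The function name and the toy matrix are hypothetical.

```python
def weighted_modularity(theta, sigma):
    """Newman-style weighted modularity of the partition `sigma`.

    theta: symmetric similarity matrix (list of lists); the diagonal is ignored.
    sigma: sigma[a] is the community label of layer a.
    Assumed form: Q = (1/E) * sum_{a,b} [theta_ab - eta_a * eta_b / E] * delta(sigma_a, sigma_b),
    with eta_a the strength of layer a and E the sum of all strengths.
    """
    n = len(theta)
    if len(sigma) != n:
        raise ValueError("sigma must assign one label per layer")
    # Strength of each layer: sum of its similarities to all other layers.
    eta = [sum(theta[a][b] for b in range(n) if b != a) for a in range(n)]
    eta_total = sum(eta)
    if eta_total == 0:
        raise ValueError("the similarity matrix has no non-zero off-diagonal entry")
    q = 0.0
    for a in range(n):
        for b in range(n):
            if sigma[a] != sigma[b]:
                continue  # the Kronecker delta vanishes
            weight = theta[a][b] if a != b else 0.0
            q += weight - eta[a] * eta[b] / eta_total
    return q / eta_total


# Toy example: two pairs of layers, similar within a pair and dissimilar across pairs.
toy_theta = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]

if __name__ == "__main__":
    print(weighted_modularity(toy_theta, [0, 0, 1, 1]))
    print(weighted_modularity(toy_theta, [0, 0, 0, 0]))
```

To select the optimal cut, one would evaluate this score for the partition produced at each agglomerative stage of the dendrogram and keep the stage with the largest value.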

As shown in Figure 4, the optimal partition found is the same whether the Infomap or the Louvain algorithm is used to perform the community detection in the layers of the multiplex. The analysis reveals that the first layers to cluster together are Condensed Matter I and II and Interdisciplinary Physics, which form the first block (green box); the second block includes General Physics, Classical Physics, and Atomic and Molecular Physics (purple box); in the third block, Particle Physics, Nuclear Physics and Geophysics&Astrophysics group together (cyan box). The layer related to Gases&Plasma Physics is isolated and can be considered as a block by itself.

Once the block (community) structure has been revealed, an interesting issue is to characterize the Minimal Spanning Tree (MST), which allows us to identify the layers that connect the blocks together. We therefore construct the MST using the dissimilarity measure $d$ defined in Eq. (15), calculated using either the Infomap or the Louvain community detection algorithm. The two MSTs are identical (Figure 5), which confirms the robustness of the results with respect to the community detection algorithm used. We can see that the collaboration layer of General Physics connects the three main blocks together.
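The construction of the MST from a dissimilarity matrix, and the identification of the tree edges that bridge different blocks, can be sketched as follows. Prim's algorithm is used here as one possible choice, since the text does not specify the algorithm; the function names, the toy matrix and the block labels are hypothetical.

```python
def minimum_spanning_tree(d):
    """Prim's algorithm on a dense symmetric dissimilarity matrix.

    Returns a list of edges (a, b, weight), where a was already in the tree
    and b is the layer added by that edge.
    """
    n = len(d)
    if n == 0:
        return []
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge leaving the current tree.
        a, b = min(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda edge: d[edge[0]][edge[1]],
        )
        edges.append((a, b, d[a][b]))
        in_tree.add(b)
    return edges


def edges_between_blocks(edges, block_of):
    """Tree edges whose endpoints belong to different blocks."""
    return [(a, b, w) for a, b, w in edges if block_of[a] != block_of[b]]


# Toy dissimilarity matrix between four hypothetical layers in two blocks.
toy_d = [
    [0.0, 0.2, 0.8, 0.6],
    [0.2, 0.0, 0.4, 1.0],
    [0.8, 0.4, 0.0, 0.1],
    [0.6, 1.0, 0.1, 0.0],
]
toy_blocks = [0, 0, 1, 1]

if __name__ == "__main__":
    tree = minimum_spanning_tree(toy_d)
    print(tree)
    print(edges_between_blocks(tree, toy_blocks))
```

The endpoints of the edges returned by `edges_between_blocks` are the layers that connect the blocks in the tree.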


Block 1         Block 2       Block 3       Block 4
Cond Mat I-6    General-0     Particles-1   Gas&Pla-5
Cond Mat II-7   Ato&Mol-3     Nuclear-2
Interd-8        Classical-4   Geo&Astro-9

TABLE II: Clusters between the $M_1 = 10$ layers of the APS multiplex network corresponding to the first level of the PACS hierarchy (see Table I for the legend of the layer acronyms). The clusters have been obtained from the dendrograms shown in Figure 4, cut in order to obtain the partition that optimizes the weighted modularity $Q$ defined in Eq. (17).



FIG. 5: (Color online) Minimal Spanning Tree (MST) obtained using the dissimilarity measure $d$ in the case of the Infomap-$d$ dissimilarity (blue) and in the case of the Louvain-$d$ dissimilarity (ocher). The block structure obtained with the hierarchical clustering analysis is also shown.



FIG. 6: (Color online) Hierarchical clustering of the APS Collaboration Multiplex Network in which each layer represents a collaboration network in a specific area of physics, as described by the second hierarchical level of the PACS code. We show the two dendrograms obtained respectively from the Louvain-$\widetilde{\Theta}^{S}_{\alpha,\beta}$ (left) and from the Infomap-$\widetilde{\Theta}^{S}_{\alpha,\beta}$ (right) similarity matrices. In each dendrogram the communities found at the optimal partition (maximum of $Q$) are represented as branches of the same color.


In order to gain a deeper understanding of the results found above, we now consider the multiplex network of scientific collaborations where the layers are related to the PACS code at the second level of the PACS hierarchy. For this multiplex network we have calculated the similarity matrix $\widetilde{\Theta}^{S}$ between the $M_2 = 66$ layers and found the optimal partition into communities according to the score function $Q$, following a procedure analogous to the one used previously for the first level of the PACS hierarchy. To calculate $\widetilde{\Theta}^{S}_{\alpha,\beta}$ we have performed averages over 350 random permutations of the community assignments.

In Figure 6 we plot the dendrograms resulting from the hierarchical clustering analysis in the case of the Louvain-$d$ dissimilarity and of the Infomap-$d$ dissimilarity. For each dendrogram, the clusters found in the optimal partition are represented as branches of the same color. When using the Louvain-$d$ dissimilarity we obtain six clusters plus some isolated layers. When using the Infomap-$d$ dissimilarity we obtain four clusters plus isolated layers. Nevertheless, we observe that two of the clusters (the red and the violet clusters) are identical in the two partitions. The other two clusters obtained with the Infomap-$d$ dissimilarity are each divided into two clusters in the optimal partition obtained with the Louvain-$d$ dissimilarity. In particular, the combination of the green-yellow and green-blue clusters in the Louvain partition is identical to the green cluster of the Infomap partition, while the combination of the orange and the yellow clusters in the Louvain partition is identical to the brown cluster of the Infomap partition.

In Figure 7 we give an overview of the hierarchy of blocks found. The four clusters found in the Infomap-$d$ optimal partition are represented by solid-line ovals. Dashed ovals split two of the clusters in two, according to the results obtained from the Louvain-$d$ optimal partition. The block structure at the first level of the PACS hierarchy is shown using solid-line polygons. This method allows us to characterize, with a bottom-up approach, how the organization of knowledge in physics is effectively perceived by scientists while shaping their collaboration network. We observe that, while the PACS hierarchy clearly captures the main features of the collaboration network, the analysis of the Collaboration Multiplex Network at the second level of the PACS hierarchy suggests a hierarchical organization of these PACS numbers that is not equivalent to the first level of the PACS hierarchy.
Finally we used the information gained by this analysis to construct the network of networks between the layers of the Collaboration Multiplex Network at the second level of the PACS hierarchy. To this aim we have constructed the weighted network determined by an appropriate thresholding of the Louvain-$\widetilde{\Theta}^{S}$ or Infomap-$\widetilde{\Theta}^{S}$ similarity matrix (see Figure 8). The threshold is here given by the largest value of $\widetilde{\Theta}^{S}$ that ensures that each layer is connected to at least one other layer of its own cluster; links with a smaller similarity are filtered out. From these networks, it is possible to appreciate that, although the network between the layers of the Collaboration Multiplex Network is highly interconnected, the clusters found correspond to layers that are much more similar to each other than to layers outside their own cluster. Interestingly, this visualization shows that the two clusters detected only by the Louvain algorithm, [94,96] and [29,41,52,84], contain the nodes that act as bridges between the yellow-green cluster and the red and the orange clusters. This might explain why the Louvain algorithm identifies them as separate clusters.

FIG. 7: (Color online) Optimal community structure of the layers of the APS Collaboration Network in which each layer represents a collaboration network in a specific area of physics, as described by the second hierarchical level of the PACS code. The four communities found starting from the Infomap-$\widetilde{\Theta}^{S}_{\alpha,\beta}$ matrix are represented by blue solid-line ovals. In the partition obtained from the Louvain-$\widetilde{\Theta}^{S}_{\alpha,\beta}$ matrix, two sub-communities (ocher dashed ovals) are considered separate communities. These communities form the coarse-grained partition into the three blocks found at the first hierarchical level of the PACS code (colored solid-line polygons). The nodes displayed in this figure correspond to a subset of 61 layers that are not isolated in the optimal partition into communities, which optimizes the weighted modularity $Q$.



FIG. 8: (Color online) The network between the layers of the APS Collaboration Multiplex Network (with layers corresponding to the PACS code at the second level of the PACS hierarchy) is displayed here for the two cases in which the Louvain-$\widetilde{\Theta}^{S}$ or the Infomap-$\widetilde{\Theta}^{S}$ similarity matrix is used. The link weights represent the similarity between the community structures of the two linked layers. The networks are obtained from the $\widetilde{\Theta}^{S}$ similarity matrix by filtering out the links below a given threshold value. The threshold is chosen to be the maximal value that ensures that in the filtered network each layer is connected with at least one layer inside its own cluster. The architecture of the networks describes the interplay between the collaboration networks and the organization of knowledge in physics. The community structure revealed by the hierarchical clustering analysis is shown using the same color scheme as in Figure 6.
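The thresholding rule can be sketched in a few lines of Python. Under the reading adopted here, the threshold equals the smallest, over all layers, of each layer's largest similarity to another layer of its own cluster: any larger threshold would disconnect at least one layer from its cluster. Layers that are alone in their cluster are assumed to impose no constraint. The function names and the toy data are hypothetical.

```python
def cluster_preserving_threshold(theta, cluster_of):
    """Largest threshold keeping every layer linked to its own cluster.

    theta: symmetric similarity matrix (list of lists).
    cluster_of: cluster_of[a] is the cluster label of layer a.
    """
    n = len(theta)
    best_in_cluster = []
    for a in range(n):
        same = [
            theta[a][b]
            for b in range(n)
            if b != a and cluster_of[b] == cluster_of[a]
        ]
        if same:  # isolated layers impose no constraint (assumption)
            best_in_cluster.append(max(same))
    if not best_in_cluster:
        raise ValueError("no cluster contains more than one layer")
    return min(best_in_cluster)


def filter_network(theta, threshold):
    """Weighted edge list keeping only links with similarity >= threshold."""
    n = len(theta)
    return [
        (a, b, theta[a][b])
        for a in range(n)
        for b in range(a + 1, n)
        if theta[a][b] >= threshold
    ]


# Toy similarity matrix between four hypothetical layers in two clusters.
toy_theta = [
    [0.0, 0.9, 0.3, 0.2],
    [0.9, 0.0, 0.5, 0.1],
    [0.3, 0.5, 0.0, 0.4],
    [0.2, 0.1, 0.4, 0.0],
]
toy_clusters = [0, 0, 1, 1]

if __name__ == "__main__":
    t = cluster_preserving_threshold(toy_theta, toy_clusters)
    print(t, filter_network(toy_theta, t))
```

In this toy example the link between layers 1 and 2 survives the filtering although the two layers belong to different clusters, which is how bridging layers become visible in the filtered network.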


V. COMPARISON OF THE RESULTS OBTAINED WITH $\widetilde{\Theta}^{S}$ WITH RESPECT TO OTHER SIMILARITY MEASURES


In this Section we compare the results obtained from the analysis of the APS Collaboration Multiplex Network using the $\widetilde{\Theta}^{S}$ indicator with results from other similarity measures commonly used to compare different network partitions, and with the ACTIVIS Index, an index able to capture the similarity of the layers of a multiplex due to the activity of the nodes. In particular, focusing on the highest level of the PACS hierarchy, we compute the Normalized Mutual Information $NMI$ [35], the Jaccard index $J$ [32], the Rand index $R$ [32] and the ACTIVIS Index for each pair of the $M_1 = 10$ layers. Given two network partitions $X$ and $Y$, the Normalized Mutual Information $NMI$ is defined as


$$NMI(X, Y) = \frac{2\,[H(X) - H(X|Y)]}{H(X) + H(Y)}, \qquad (19)$$

where $H(X) = -\sum_{x} P(x) \log P(x)$ is the entropy associated with the distribution $P(x)$ of sizes $x$ of the clusters classified by the partition $X$; $H(Y)$ is the entropy associated with the distribution $P(y)$ of the sizes $y$ of the clusters in the partition $Y$; and $H(X|Y)$ is the conditional entropy associated with the distribution of the community assignment $X$ conditioned on the distribution of the community assignment $Y$, given by $H(X|Y) = -\sum_{x,y} P(x,y) \log [P(x,y)/P(y)]$, with $P(x,y)$ the distribution of the number of nodes having community assignment $x$ in partition $X$ and $y$ in partition $Y$.
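A minimal Python sketch of Eq. (19) follows. It assumes that each partition is given as a list of community labels, one per node, so that $P(x)$ is the fraction of nodes in cluster $x$ and $P(x,y)$ the fraction of nodes with labels $x$ and $y$ in the two partitions. Natural logarithms are used, and returning 1 when both partitions consist of a single cluster is a convention chosen here, not one stated in the text.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy of the cluster-size distribution of a partition."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(x, y):
    """H(X|Y) = -sum_{x,y} P(x,y) log[P(x,y) / P(y)]."""
    n = len(x)
    joint = Counter(zip(x, y))
    marginal_y = Counter(y)
    return -sum(
        (c / n) * log((c / n) / (marginal_y[y_label] / n))
        for (_, y_label), c in joint.items()
    )


def normalized_mutual_information(x, y):
    """NMI(X, Y) = 2 [H(X) - H(X|Y)] / [H(X) + H(Y)]."""
    if len(x) != len(y):
        raise ValueError("partitions must label the same set of nodes")
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 1.0  # both partitions are a single cluster (assumed convention)
    return 2 * (h_x - conditional_entropy(x, y)) / (h_x + h_y)


if __name__ == "__main__":
    print(normalized_mutual_information([0, 0, 1, 1], [1, 1, 0, 0]))
    print(normalized_mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))
```

The first call compares two partitions that differ only by a relabeling of the clusters, while the second compares two partitions that are statistically independent.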

The Jaccard index $J$ and the Rand index $R$ are instead defined as

$$J(X, Y) = \frac{a_{11}}{a_{11} + a_{10} + a_{01}}, \qquad R(X, Y) = \frac{a_{11} + a_{00}}{a_{11} + a_{10} + a_{01} + a_{00}}, \qquad (20)$$

where $a_{11}$ is the number of pairs of nodes belonging to the same cluster in both partitions $X$ and $Y$, $a_{00}$ is the number of pairs of nodes classified in different clusters in both the $X$ and $Y$ partitions, and $a_{10}$ ($a_{01}$) is the number of pairs of nodes belonging to the same cluster in $X$ ($Y$) but belonging to different clusters in $Y$ ($X$).
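The pair counts entering Eq. (20) can be obtained by a direct enumeration of all node pairs, as in the sketch below. The partitions are assumed to be lists of community labels, one per node; the function names are illustrative, and the value returned by the Jaccard index when no pair is co-clustered in either partition is a convention chosen here.

```python
from itertools import combinations


def pair_counts(x, y):
    """Return (a11, a10, a01, a00) for two partitions of the same nodes."""
    if len(x) != len(y):
        raise ValueError("partitions must label the same set of nodes")
    a11 = a10 = a01 = a00 = 0
    for i, j in combinations(range(len(x)), 2):
        same_x = x[i] == x[j]
        same_y = y[i] == y[j]
        if same_x and same_y:
            a11 += 1
        elif same_x:
            a10 += 1
        elif same_y:
            a01 += 1
        else:
            a00 += 1
    return a11, a10, a01, a00


def jaccard_index(x, y):
    a11, a10, a01, _ = pair_counts(x, y)
    denominator = a11 + a10 + a01
    # Convention assumed here: partitions with no co-clustered pair agree.
    return a11 / denominator if denominator else 1.0


def rand_index(x, y):
    a11, a10, a01, a00 = pair_counts(x, y)
    total = a11 + a10 + a01 + a00
    if total == 0:
        raise ValueError("at least two nodes are needed")
    return (a11 + a00) / total


if __name__ == "__main__":
    x, y = [0, 0, 1, 1], [0, 0, 1, 2]
    print(pair_counts(x, y), jaccard_index(x, y), rand_index(x, y))
```

The enumeration costs a number of operations quadratic in the number of nodes; for large layers the same counts can be derived from the contingency table of the two partitions.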

Finally we define the Activity Similarity (ACTIVIS) Index between the layers $\alpha$ and $\beta$ of a multiplex network, which compares the activity patterns in different layers. This index is given by

$$\mathrm{ACTIVIS} = b_{11} + b_{00}, \qquad (21)$$

where $b_{11}$ is the fraction of nodes active in both layers and $b_{00}$ is the fraction of nodes inactive in both layers.
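Eq. (21) can be illustrated with a short Python sketch. The activity of the nodes in each layer is assumed to be given as a list of booleans, one per node of the multiplex, where a node is taken to be active in a layer if it has at least one link in that layer; this encoding and the function name are assumptions made for the example.

```python
def activis_index(active_alpha, active_beta):
    """Fraction of nodes that are active in both layers or inactive in both."""
    if len(active_alpha) != len(active_beta):
        raise ValueError("both layers must be defined on the same nodes")
    n = len(active_alpha)
    if n == 0:
        raise ValueError("at least one node is needed")
    pairs = list(zip(active_alpha, active_beta))
    b11 = sum(1 for u, v in pairs if u and v) / n
    b00 = sum(1 for u, v in pairs if not u and not v) / n
    return b11 + b00


if __name__ == "__main__":
    layer_alpha = [True, True, False, False, True]
    layer_beta = [True, False, False, True, True]
    print(activis_index(layer_alpha, layer_beta))
```

The index equals 1 when the two layers have identical activity patterns and 0 when every node is active in exactly one of the two layers.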

In Figure 9 we show the similarity matrices for the different measures and their respective dendrograms, obtained with the same hierarchical clustering analysis discussed above for the $\widetilde{\Theta}^{S}$ case. Here the layer partitions are obtained using the Infomap algorithm. When the modularity $Q$ is optimized, the partitions obtained with all these alternative measures are different from the one obtained using the $\widetilde{\Theta}^{S}$ indicator function. Moreover, the partitions obtained are characterized by having at least 3 out of 10 layers in separate clusters, resulting in significantly less relevant partitions. Finally, by looking at the dendrograms, we can see that none of the other measures is able to give the optimal partition obtained with $\widetilde{\Theta}^{S}$, even by applying an arbitrary cut to the respective dendrogram.

These results show clearly that the proposed indicator function $\widetilde{\Theta}^{S}$, based on information theory, is not equivalent to previously defined similarity measures between partitions. Moreover, the method is not affected significantly by the choice we made for treating inactive nodes or nodes belonging to connected components of two nodes. Although it might be a challenging technical problem to assess which of the similarity measures proposed so far is the best, the similarity measure $\widetilde{\Theta}^{S}$ seems to be more relevant than the other similarity measures used in the literature when applied to the APS Collaboration Multiplex Network. In fact, the partition obtained by using the similarity measure $\widetilde{\Theta}^{S}$ reflects much more closely the general perception of the organization of collaborations in the physics community.


VI. CONCLUSIONS 

Characterizing the mesoscopic structure of multiplex networks is crucial for understanding large network datasets where the nodes are connected by different types of interactions. Such multilayer networks are ubiquitous, and systems as different as social networks, transportation networks or cellular and brain networks require a multilayer description. Here, by using information theory tools, we have defined an indicator function $\widetilde{\Theta}^{S}$ able to measure the mesoscopic similarities between the layers of a multiplex network. This indicator can be used to quantitatively compare the layers of a multiplex network with respect to the mesoscopic structure induced by any feature depending on the layer architecture. In particular, here we have focused on the case in which the feature of the nodes is their community assignment. We have shown that $\widetilde{\Theta}^{S}$ can reveal the network between the layers of a multiplex, and we have applied this method to the Collaboration Multiplex Network at the two levels of the PACS hierarchy, obtaining a bottom-up approach to identify how the organization of knowledge in physics is reflected in the structure of collaboration networks.

FIG. 9: (Color online) Other similarity measures used to hierarchically cluster the $M_1 = 10$ layers of the APS Collaboration Multiplex Network at the first level of the PACS hierarchy. The similarity matrices and their respective dendrograms, cut at the partition optimizing the modularity $Q$ (red dashed line), are shown for the Normalized Mutual Information (Panel a), the Rand Index (Panel b), the Jaccard Index (Panel c) and the ACTIVIS Index (Panel d). Layer partitions are obtained using the Infomap community detection algorithm. None of the optimal partitions corresponds to the one obtained using $\widetilde{\Theta}^{S}$ to measure similarities.